In this paper we present a neural network based estimator system which
performs well in extracting frequencies from unevenly sampled signals. It
uses an unsupervised Hebbian nonlinear neural algorithm to extract the
principal components which, in turn, are used by the MUSIC frequency
estimator algorithm to extract the frequencies.
We generalize this method to avoid an interpolation preprocessing
step and to improve the performance by using a new stop criterion to
avoid overfitting.
The experimental results are obtained by comparing our methodology
with the others known in the literature.

1 Introduction



Periodicity analysis of unevenly collected data is a relevant issue in several
scientific fields. Classical spectral analysis methods are unsatisfactory for
this problem. In this paper we present a neural network based estimator system
which performs well in extracting frequencies from unevenly sampled signals.
It uses an unsupervised Hebbian nonlinear neural algorithm to extract the
principal components of the signal auto-correlation matrix, which, in turn, are
used by the MUSIC frequency estimator algorithm to extract the frequencies
[11], [15]. We generalize this method to avoid an interpolation preprocessing
step, which generally adds considerable noise to the signal, and we improve the
system performance by using a new stop criterion to avoid overfitting problems.
The experimental results are obtained by comparing our methodology with the
others known in the literature (see [6], [10]).



2 Evenly and unevenly sampled data 

In what follows, we assume $x$ to be a physical variable measured at discrete
times $t_i$. $x(t_i)$ can be written as the sum of the signal $x_s$ and random
errors $R$: $x_i = x(t_i) = x_s(t_i) + R(t_i)$. The problem we are dealing with
is how to estimate the fundamental frequencies which may be present in the
signal $x_s(t_i)$ [11].
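As a concrete illustration of this data model, the following minimal Python sketch draws unevenly spaced times and adds noise to a single sinusoid. The function name `sample_uneven_signal`, the uniform draw of the observation times and the Gaussian form of $R$ are assumptions made for the example; the text does not specify them.

```python
import math
import random


def sample_uneven_signal(n_samples, freq, noise_std, t_max, seed=0):
    """Return (times, values) with x(t_i) = x_s(t_i) + R(t_i).

    Assumptions for illustration only: x_s is a unit-amplitude sinusoid,
    R is zero-mean Gaussian noise, and the times are sorted uniform draws.
    """
    rng = random.Random(seed)
    # Unevenly spaced observation times on [0, t_max].
    times = sorted(rng.uniform(0.0, t_max) for _ in range(n_samples))
    values = []
    for t in times:
        signal = math.sin(2.0 * math.pi * freq * t)  # x_s(t_i)
        noise = rng.gauss(0.0, noise_std)            # R(t_i)
        values.append(signal + noise)
    return times, values


# Example call: 21 samples of a 0.26 (1/day) sinusoid over 180 days.
times, values = sample_uneven_signal(21, 0.26, 0.1, 180.0)
```

The example values (21 samples, 0.26 cycles per day) are chosen only to resemble the scale of the real signal discussed later.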

If $x$ is measured at uniform time steps (even sampling) there are many
tools, based on Fourier analysis, that solve the problem effectively [10].
These methods, however, are usually unreliable for unevenly sampled data
[4]. For instance, the typical approach of resampling the data into an evenly
sampled sequence through interpolation strongly amplifies the noise, which
undermines all Fourier based techniques, since they depend strongly on the
noise level.
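The resampling step mentioned above can be sketched as follows. This is a plain linear interpolation onto an even grid, written only to make the preprocessing concrete; the helper name `resample_linear` is hypothetical and the sketch assumes strictly increasing input times and at least two output points.

```python
def resample_linear(times, values, n_points):
    """Linearly interpolate (times, values) onto n_points evenly spaced times.

    Assumes times are strictly increasing and n_points >= 2.
    """
    t_first, t_last = times[0], times[-1]
    step = (t_last - t_first) / (n_points - 1)
    grid = [t_first + k * step for k in range(n_points)]
    resampled = []
    j = 0
    for t in grid:
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        ta, tb = times[j], times[j + 1]
        weight = (t - ta) / (tb - ta)
        resampled.append(values[j] + weight * (values[j + 1] - values[j]))
    return grid, resampled
```

Every resampled value is a weighted mix of two noisy neighbours, so wide gaps in the original sampling are filled with values that carry no new information about the signal.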

To solve the problem of unevenly sampled data, we consider two classes of 
spectral estimators: 

Spectral estimators based on the Fourier Transform (Least Squares methods);

Spectral estimators based on the eigenvalues and eigenvectors of the
covariance matrix (Maximum Likelihood methods).

Classic Periodogram [10], Lomb's Periodogram, Scargle's Periodogram [14]
and DCDFT are the methods of the first class that we use, while MUSIC and
ESPRIT belong to the second class.
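For readers unfamiliar with the first class, the sketch below implements the widely used normalized form of Lomb's periodogram in plain Python. It is an illustration of the standard textbook formula, not the code used for the experiments; the function name, the test signal and the frequency grid are assumptions, and all frequencies must be strictly positive.

```python
import math


def lomb_periodogram(times, values, freqs):
    """Normalized Lomb periodogram of unevenly sampled data (freqs > 0)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # Time offset tau that makes the sine and cosine terms orthogonal.
        s2 = sum(math.sin(2.0 * w * t) for t in times)
        c2 = sum(math.cos(2.0 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * w)
        cs = sum((v - mean) * math.cos(w * (t - tau)) for t, v in zip(times, values))
        sn = sum((v - mean) * math.sin(w * (t - tau)) for t, v in zip(times, values))
        cc = sum(math.cos(w * (t - tau)) ** 2 for t in times)
        ss = sum(math.sin(w * (t - tau)) ** 2 for t in times)
        power.append((cs * cs / cc + sn * sn / ss) / (2.0 * var))
    return power


# Hypothetical noise-free test signal on jittered sampling times.
times = [1.7 * i + 0.6 * math.sin(i * i) for i in range(40)]
values = [math.sin(2.0 * math.pi * 0.26 * t) for t in times]
freqs = [0.01 * k for k in range(1, 51)]
power = lomb_periodogram(times, values, freqs)
best = freqs[power.index(max(power))]  # grid frequency with the highest power
```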

The methods based on the covariance matrix are more recent and have great
potential. Starting from this consideration, we develop a method based on the
MUSIC estimator. It is compared with the classic methods to highlight the results.

3 The Neural Estimator 

In recent years several papers have dealt with learning in PCA neural nets
[13], [15], identifying the advantages, problems and difficulties of such neural
networks. In what follows we shall use a robust hierarchical learning algorithm

$$w_{k+1}(i) = w_k(i) + \mu_k\, g(y_k(i))\, e_k(i), \tag{1}$$

$$e_k(i) = x_k - \sum_{j=1}^{i} y_k(j)\, w_k(j) \tag{2}$$

where $w_k(i)$ is the weight vector of the $i$-th output neuron at step $k$,
$y_k(i)$ is the corresponding output, $\mu_k$ is the learning rate and
$g(t) = \tanh(\alpha t)$ is the learning function, chosen because it has been
experimentally shown to be the best performing one in our problem [11].
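A single application of the update in equations (1) and (2) can be sketched in Python as below. This is a minimal reading of the two equations, not the authors' implementation: the function name `hebbian_step` and the list-of-lists weight layout are assumptions, and the outputs are computed as $y_k(i) = w_k^T(i)\,x_k$.

```python
import math


def hebbian_step(weights, x, mu, alpha):
    """Apply one update of equations (1)-(2) to all p weight vectors.

    weights: list of p weight vectors (lists of floats), left unmodified.
    x: input pattern x_k; mu: learning rate; alpha: slope of tanh.
    Returns the list of updated weight vectors.
    """
    # Outputs y(i) = w(i)^T x, using the weights before the update.
    y = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
    residual = list(x)
    new_weights = []
    for i, w in enumerate(weights):
        # e(i) = x - sum_{j <= i} y(j) w(j): one more term removed per neuron.
        residual = [r - y[i] * wi for r, wi in zip(residual, w)]
        gain = mu * math.tanh(alpha * y[i])  # mu * g(y(i))
        new_weights.append([wi + gain * e for wi, e in zip(w, residual)])
    return new_weights
```

Carrying the residual from one neuron to the next is what makes the rule hierarchical: the $i$-th neuron only sees what the first $i$ neurons have not yet explained.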

Our neural estimator (ne) can be summarized as follows:

1 Preprocessing: calculate and subtract the average pattern to obtain a zero
mean process with unit variance. Interpolate the input data if required.

2 Initialize the weight matrix and the other neural network parameters.

3 Input the $k$-th pattern $x_k = [x(k), \ldots, x(k+N+1)]$ where $N$ is the
number of input components.

4 Calculate the output for each neuron $y_k(i) = w_k^T(i)\, x_k \quad \forall i = 1, \ldots, p$.

5 Modify the weights $w_{k+1}(i) = w_k(i) + \mu_k\, g(y_k(i))\, e_k(i) \quad \forall i = 1, \ldots, p$.

6 If the convergence test is true then go to STEP 8.

7 $k = k + 1$. Go to STEP 3.

8 End.

9 Frequency estimator: we use the frequency estimator MUSIC. It takes as
input the weight matrix columns after the learning. The estimated signal
frequencies are obtained as the peak locations of the function $P_{MUSIC}$
given by the following equation [15]:

where $w(i)$ is the $i$-th neural network weight vector after learning, and
$e_f^H$ is the pure sinusoidal vector. In the case of an interpolation
preprocessing, $e_f^H = [1, e_f^{j}, \ldots, e_f^{j(L-1)}]^H$. In the
generalization to non-interpolated input data,
$e_f^H = [1, e_f^{2j\pi f t_0}, \ldots, e_f^{2j\pi f t_{(L-1)}}]^H$, where
$\{t_0, t_1, \ldots, t_{(L-1)}\}$ are the first $L$ components of the temporal
coordinates of the uneven signal.

When $f$ is the frequency of the $i$-th sinusoidal component, $f = f_i$, we
have $e = e_i$ and $P_{MUSIC} \to \infty$. In practice we have a peak near, and
in correspondence with, the component frequency. Estimates are related to the
highest peaks.
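To make the frequency estimation step concrete, the sketch below evaluates a MUSIC-style pseudospectrum directly on uneven sampling times. It uses the common subspace form $1 / (e^H e - \sum_i |e^H w(i)|^2)$, which is an assumption and may differ in normalization from the estimator used here. It also assumes that the weight vectors are orthonormal and that the sinusoidal vector is built as $e^{2j\pi f t}$ over the given times. All function names are hypothetical.

```python
import cmath
import math


def steering_vector(freq, times):
    """Pure sinusoidal vector sampled at the given (possibly uneven) times."""
    return [cmath.exp(2j * math.pi * freq * t) for t in times]


def music_spectrum(weights, times, freqs):
    """MUSIC-style pseudospectrum; weights are assumed orthonormal."""
    spectrum = []
    for f in freqs:
        e = steering_vector(f, times)
        norm2 = float(len(e))  # e^H e, since every entry has unit modulus
        projected = 0.0
        for w in weights:
            inner = sum(ek.conjugate() * wk for ek, wk in zip(e, w))  # e^H w
            projected += abs(inner) ** 2
        # Clamp the denominator so an exact match yields a large finite peak.
        spectrum.append(1.0 / max(norm2 - projected, 1e-12))
    return spectrum
```

When the sinusoidal vector at frequency $f$ lies in the span of the weight vectors, the denominator vanishes and the pseudospectrum peaks, which is the behaviour described above for $f = f_i$.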

Furthermore, to optimize the performance of the PCA neural networks, we
stop the learning process when $\sum_{i=1}^{p} |e_f^H w(i)|^2 < M \;\; \forall f$, so avoiding overfitting
problems. In fact, keeping the stop condition previously used in the ne causes
the ne to find periodicities not present in the signal, while the new condition
protects it from this problem. A simple example is illustrated in figure 2,
where we can see how the frequency identification varies depending on the stop
condition (see next section for signal information). Without the early stopping
(see figure 2.b), $\Delta f$ after 100 epochs remains at a value between 0.20
and 0.25 and we cannot know when the system reaches its best performance. The
new stopping criterion, instead, allows $\Delta f$ to reach a final value of
about 0.0 after just 50 epochs (see figure 2.a).



4 Experimental Results 

Many experiments on synthetic and real signals were made, and in this paper 
we present the results obtained with one specific real signal, which highlights 
the main features of our problem. 

The real signal is related to the Cepheid SU Cygni. The sequence was
obtained with the photometric technique UBVRI and the sampling was made from
June to December 1977. The light curve is composed of 21 samples, and has
a period of 3.8 days, as shown in figure 1.

The first experiment concerns the interpolation. In this case we apply
three different methods by using the Matlab® Signal Processing Toolbox:
linear, cubic and spline, because they are quite simple and the most used ones.
Figure 3 shows the interpolating functions and the frequency estimates
obtained by the ne with the spline interpolated signal as input.

In this case, the parameters of the ne are: $N = 10$, $p = 2$, $\alpha = 20$,
$\mu = 0.001$. The estimate frequency interval is [0 (1/JD), 0.5 (1/JD)]. The
estimated frequency without interpolation is 0.260 (1/JD).

A comparison is made with the other methods cited in a previous section
and the experimental results are shown in figure 4 and in table 1. Among these
methods, only Lomb's Periodogram agrees with the correct periodicity, although
it shows some spurious peaks. Furthermore, if we enlarge the frequency window
for the two best performing methods, the ne continues to work well, while
Lomb's periodogram does not work at all, as illustrated in figure 5.
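Since the estimates are reported as frequencies in 1/JD, it can help to read them as periods in days. The short Python sketch below does this conversion for the values listed in table 1; the dictionary keys are shortened labels introduced only for this example.

```python
# Frequency estimates from table 1, in cycles per day (1/JD).
estimates = {
    "Lomb's Periodogram": 0.260,
    "DCDFT": 0.015,
    "ESPRIT": 0.25,
    "ne (with interpolation)": 0.015,
    "ne (without interpolation)": 0.259,
}

# Period in days is the reciprocal of the frequency.
periods = {name: 1.0 / freq for name, freq in estimates.items()}

for name, period in periods.items():
    print(f"{name}: {period:.2f} days")
```

Read this way, the estimates near 0.26 (1/JD) correspond to periods a little under four days, while 0.015 (1/JD) corresponds to a period of more than two months.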

5 Concluding Remarks 

In this paper we have illustrated an improved technique based on PCA neural
networks and MUSIC to estimate the frequency of unevenly sampled data. It
has been shown that it obtains good results on real data (here we used the SU
Cygni light curve) compared with other well-known methods. In fact, it obtains
a good estimate of the signal frequency even with few unevenly sampled inputs,
it reduces the noise problems related to input data interpolation, it optimizes
the convergence by introducing an early stopping criterion and, finally, it is
more robust to the size of the frequency window.

Future research lines concern the introduction of genetic algorithms to
optimize the weight initialization of the PCA neural networks and the use of
filters to extract and identify one frequency at a time when dealing with
multi-frequency signals.

fig.(1) Light curve of SU Cygni.

fig.(2.a) Learning with early stopping condition; $\Delta f$ = (real frequency - estimate frequency).

fig.(2.b) Learning without early stopping condition; $\Delta f$ = (real frequency - estimate frequency).

fig.(3.a) Linear interpolation of SU Cygni light curve.

fig.(3.b) Spline interpolation of SU Cygni light curve.

fig.(3.c) Cubic interpolation of SU Cygni light curve.

fig.(3.d) ne estimate of SU Cygni light curve with spline interpolation.

fig.(4.a) Lomb's Periodogram estimate of SU Cygni with confidence intervals.

fig.(4.b) ne estimate of SU Cygni.

fig.(4.c) DCDFT estimate of SU Cygni.

fig.(4.d) ESPRIT estimate of SU Cygni.

fig.(5.a) Lomb's Periodogram estimate of SU Cygni (with enlarged window) with confidence intervals. Horizontal axis: frequency.

fig.(5.b) ne estimate of SU Cygni (with enlarged window). Horizontal axis: frequency.



Algorithm                               Frequency estimate (1/JD)

Lomb's Periodogram                      0.260
DCDFT                                   0.015
ESPRIT                                  0.25
ne estimate (with interpolation)        0.015
ne estimate (without interpolation)     0.259

Table 1. Frequency estimates with the methods illustrated in the paper
on the SU Cygni light curve.